White Wine Project by Matthew Phillipson

This report explores a dataset describing the quality of 4,898 white wines based on the chemical properties of each wine.

Univariate Plots Section

## [1] 4898   13
## 'data.frame':    4898 obs. of  13 variables:
##  $ X                   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ fixed.acidity       : num  7 6.3 8.1 7.2 7.2 8.1 6.2 7 6.3 8.1 ...
##  $ volatile.acidity    : num  0.27 0.3 0.28 0.23 0.23 0.28 0.32 0.27 0.3 0.22 ...
##  $ citric.acid         : num  0.36 0.34 0.4 0.32 0.32 0.4 0.16 0.36 0.34 0.43 ...
##  $ residual.sugar      : num  20.7 1.6 6.9 8.5 8.5 6.9 7 20.7 1.6 1.5 ...
##  $ chlorides           : num  0.045 0.049 0.05 0.058 0.058 0.05 0.045 0.045 0.049 0.044 ...
##  $ free.sulfur.dioxide : num  45 14 30 47 47 30 30 45 14 28 ...
##  $ total.sulfur.dioxide: num  170 132 97 186 186 97 136 170 132 129 ...
##  $ density             : num  1.001 0.994 0.995 0.996 0.996 ...
##  $ pH                  : num  3 3.3 3.26 3.19 3.19 3.26 3.18 3 3.3 3.22 ...
##  $ sulphates           : num  0.45 0.49 0.44 0.4 0.4 0.44 0.47 0.45 0.49 0.45 ...
##  $ alcohol             : num  8.8 9.5 10.1 9.9 9.9 10.1 9.6 8.8 9.5 11 ...
##  $ quality             : int  6 6 6 6 6 6 6 6 6 6 ...
##        X        fixed.acidity    volatile.acidity  citric.acid    
##  Min.   :   1   Min.   : 3.800   Min.   :0.0800   Min.   :0.0000  
##  1st Qu.:1225   1st Qu.: 6.300   1st Qu.:0.2100   1st Qu.:0.2700  
##  Median :2450   Median : 6.800   Median :0.2600   Median :0.3200  
##  Mean   :2450   Mean   : 6.855   Mean   :0.2782   Mean   :0.3342  
##  3rd Qu.:3674   3rd Qu.: 7.300   3rd Qu.:0.3200   3rd Qu.:0.3900  
##  Max.   :4898   Max.   :14.200   Max.   :1.1000   Max.   :1.6600  
##  residual.sugar     chlorides       free.sulfur.dioxide total.sulfur.dioxide
##  Min.   : 0.600   Min.   :0.00900   Min.   :  2.00      Min.   :  9.0       
##  1st Qu.: 1.700   1st Qu.:0.03600   1st Qu.: 23.00      1st Qu.:108.0       
##  Median : 5.200   Median :0.04300   Median : 34.00      Median :134.0       
##  Mean   : 6.391   Mean   :0.04577   Mean   : 35.31      Mean   :138.4       
##  3rd Qu.: 9.900   3rd Qu.:0.05000   3rd Qu.: 46.00      3rd Qu.:167.0       
##  Max.   :65.800   Max.   :0.34600   Max.   :289.00      Max.   :440.0       
##     density             pH          sulphates         alcohol     
##  Min.   :0.9871   Min.   :2.720   Min.   :0.2200   Min.   : 8.00  
##  1st Qu.:0.9917   1st Qu.:3.090   1st Qu.:0.4100   1st Qu.: 9.50  
##  Median :0.9937   Median :3.180   Median :0.4700   Median :10.40  
##  Mean   :0.9940   Mean   :3.188   Mean   :0.4898   Mean   :10.51  
##  3rd Qu.:0.9961   3rd Qu.:3.280   3rd Qu.:0.5500   3rd Qu.:11.40  
##  Max.   :1.0390   Max.   :3.820   Max.   :1.0800   Max.   :14.20  
##     quality     
##  Min.   :3.000  
##  1st Qu.:5.000  
##  Median :6.000  
##  Mean   :5.878  
##  3rd Qu.:6.000  
##  Max.   :9.000

This is a summary of the 13 variables that describe each of the 4,898 white wines. This also includes the structure of each variable in the dataset.
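
The output above is what `dim()`, `str()`, and `summary()` print for a data frame. As a minimal sketch, the block below rebuilds the first ten rows of four columns from the `str()` listing and runs the same three calls. The name `wine_head` is hypothetical, and the real analysis would load the complete file rather than ten rows.

```r
# First ten rows of four columns, copied from the str() output above.
# `wine_head` is an illustrative name, not an object from the original report.
wine_head <- data.frame(
  fixed.acidity  = c(7, 6.3, 8.1, 7.2, 7.2, 8.1, 6.2, 7, 6.3, 8.1),
  residual.sugar = c(20.7, 1.6, 6.9, 8.5, 8.5, 6.9, 7, 20.7, 1.6, 1.5),
  alcohol        = c(8.8, 9.5, 10.1, 9.9, 9.9, 10.1, 9.6, 8.8, 9.5, 11),
  quality        = c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L)
)

dim(wine_head)      # number of rows and columns
str(wine_head)      # type and leading values of each variable
summary(wine_head)  # min, quartiles, median, mean, and max per variable
```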

Quality

The distribution of quality appears roughly normal, with the bulk of the wines scoring between 5 and 7. There are no wines with a quality score below 3, and only a few wines with a quality score of 9.

Fixed Acidity

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.800   6.300   6.800   6.855   7.300  14.200

The median fixed acidity in the white wines is 6.800 g/dm^3. Most wines have an acidity ranging from 6.30 to 7.30. You can also see there is an outlier that has a fixed acidity over 14.0.
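
One way to make the outlier remark concrete is Tukey's 1.5 x IQR rule. This is a common convention assumed here for illustration, not necessarily the criterion used in the report. The sketch takes the quartiles from the summary above and gives an upper fence of about 8.8 g/dm^3, well below the maximum of 14.2.

```r
# Quartiles and maximum of fixed.acidity, taken from the summary above.
q1 <- 6.3
q3 <- 7.3
max_value <- 14.2

iqr <- q3 - q1                  # interquartile range
upper_fence <- q3 + 1.5 * iqr   # Tukey's upper fence (assumed convention)

upper_fence
max_value > upper_fence         # TRUE means the maximum is flagged as an outlier
```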

Volatile Acidity

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0800  0.2100  0.2600  0.2782  0.3200  1.1000

The distribution of volatile acidity is right skewed, with a median value of 0.2600. Most values fall between 0.21 and 0.32. You can see there are a few outliers at 0.9, 1.0, and 1.1.

Citric Acid

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.2700  0.3200  0.3342  0.3900  1.6600

Most of the white wines have a citric acid concentration between 0.27 and 0.39 g/dm^3. The distribution is right skewed; however, you can see it also has a peak around 0.48, and a few wines have a citric acid value over 1.0.

Residual Sugar

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.600   1.700   5.200   6.391   9.900  65.800

The residual sugar distribution has a median value of 5.2 g/dm^3. It is a right-skewed distribution with a long tail: as you can see, there are multiple bars on the right that stretch the data all the way to 65.8 g/dm^3.

Chlorides

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00900 0.03600 0.04300 0.04577 0.05000 0.34600

The amount of chlorides in the white wines has a median value of 0.043 g/dm^3. It looks like a normal distribution around the peak but has a long tail on the right side as the maximum amount of chlorides in the dataset is 0.346 g/dm^3.

Free Sulfur Dioxide

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.00   23.00   34.00   35.31   46.00  289.00

The free sulfur dioxide distribution is also right skewed. The median value is 34.0 mg/dm^3 while the mean value is 35.31 mg/dm^3. These are fairly close; however, there is a huge gap between 145 and the maximum value of 289.0.

Total Sulfur Dioxide

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     9.0   108.0   134.0   138.4   167.0   440.0

The total sulfur dioxide distribution is close to symmetrical, as it has a median value of 134 mg/dm^3 and a mean value of 138.4 mg/dm^3. We can see there are a few outliers that have a total sulfur dioxide concentration higher than 275 mg/dm^3.

Density

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.9871  0.9917  0.9937  0.9940  0.9961  1.0390

The density of the white wines does not vary a lot, as most of the values are between 0.9917 and 0.9961 g/cm^3. The distribution is close to symmetrical, but there is one wine with the maximum density of 1.0390 g/cm^3.

pH

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.720   3.090   3.180   3.188   3.280   3.820

All the wines have a low pH, which means they are acidic: on the 0-14 pH scale, any value below 7 is acidic. The distribution is symmetrical, as the median value of 3.180 and the mean value of 3.188 are almost identical.

Sulphates

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.2200  0.4100  0.4700  0.4898  0.5500  1.0800

The sulphates distribution is slightly right skewed. The median value of sulphates is 0.47. Most of the white wines have a concentration between 0.41 and 0.55.

Alcohol

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    8.00    9.50   10.40   10.51   11.40   14.20

The distribution of alcohol is right skewed. The minimum alcohol level in this dataset is 8%. The median value is 10.4%, which is expected in white wines.

Univariate Analysis

What is the structure of your dataset?

The dataset has 13 variables that explain 4,898 different white wines. One variable ‘X’ actually just numbers the wines from 1 to 4,898.

What is/are the main feature(s) of interest in your dataset?

The main feature of the dataset that interests me is the quality rating of the wines.

What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

I believe all the chemical tests may add support to the investigation. Each factor contributes to the overall flavor and quality of the wine. Some variables may have a strong correlation with each other, such as total sulfur dioxide and free sulfur dioxide.

Did you create any new variables from existing variables in the dataset?

No new variables were created from the existing variables.

Of the features you investigated, were there any unusual distributions?

There were no unusual distributions in the dataset. The dataset was already tidy which makes it ideal to use in this situation.

Bivariate Plots Section

Fixed Acidity vs. Quality

## [1] "Median of fixed.acidity by quality:"
## white_wine$quality: 3
## [1] 7.3
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 6.9
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 6.8
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 6.8
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 6.7
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 6.8
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 7.1

We see at most a slight downward trend in fixed acidity as quality increases. For quality levels 4-8 the median fixed acidity stays between 6.7 and 6.9. For the extreme quality levels of 3 and 9 the median is above 7.0, which suggests that fixed acidity does not have a large impact on quality.
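
The grouped medians in this section have the layout that base R's `by()` prints. A minimal sketch of that call follows, assuming a data frame named `white_wine` as the output suggests. The six rows are made up for illustration and are not taken from the dataset.

```r
# Toy stand-in for the real data; these six rows are made up for illustration.
white_wine <- data.frame(
  quality       = c(5, 5, 5, 6, 6, 6),
  fixed.acidity = c(6.5, 6.8, 7.4, 6.2, 6.7, 7.0)
)

print("Median of fixed.acidity by quality:")

# Split fixed.acidity by quality level and take the median of each group.
medians <- by(white_wine$fixed.acidity, white_wine$quality, median)
medians
```

Swapping `fixed.acidity` for another column name would give the corresponding table for that variable.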

Volatile Acidity vs. Quality

## [1] "Median of volatile.acidity by quality:"
## white_wine$quality: 3
## [1] 0.26
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 0.32
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 0.28
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 0.25
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 0.25
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 0.26
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 0.27

Based on the distribution we can see a trend where a lower volatile acidity tends to go with a higher wine quality. This can be seen in quality levels 6-8, which have lower medians (0.25 to 0.26) than quality levels 4 and 5 (0.32 and 0.28).

Citric Acid vs. Quality

## [1] "Median of citric.acid by quality:"
## white_wine$quality: 3
## [1] 0.345
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 0.29
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 0.32
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 0.32
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 0.31
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 0.32
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 0.36

We see that a higher citric acid level seems to go with a higher quality wine. A wine quality of 4 has a median citric acid level of 0.29 g/dm^3, while wines with a quality of 5-8 have a median level between 0.31 and 0.32 g/dm^3. A quality level of 9 has a median citric acid value of 0.36 g/dm^3.

Residual Sugar vs. Quality

## [1] "Median of residual.sugar by quality:"
## white_wine$quality: 3
## [1] 4.6
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 2.5
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 7
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 5.3
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 3.65
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 4.3
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 2.2

After getting a better look at the distribution we see that residual sugar has a low impact on the quality of wine. We see there are peaks and troughs. For example, a wine quality of 6 has a median residual sugar level of 5.3 g/dm^3; at a wine quality of 7 the median drops to 3.65 g/dm^3, then it picks back up at a wine quality of 8.

Chlorides vs. Quality

## [1] "Median of chlorides by quality:"
## white_wine$quality: 3
## [1] 0.041
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 0.046
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 0.047
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 0.043
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 0.037
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 0.036
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 0.031

There is a slight relationship between chlorides and quality: the lower the chloride level, the higher the quality.

Free Sulfur Dioxide vs. Quality

## [1] "Median of free.sulfur.dioxide by quality:"
## white_wine$quality: 3
## [1] 33.5
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 18
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 35
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 34
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 33
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 35
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 28

The wines that have a quality level between 5-8 seem to have a higher free sulfur dioxide than a quality level of 4 or 9.

Coming from the dataset description, SO2 is mostly undetectable in wine in low concentrations, but at free SO2 concentrations over 50 ppm (~ 50 mg/ dm^3), SO2 becomes evident in the nose and taste of wine.

Total Sulfur Dioxide vs. Quality

## [1] "Median of total.sulfur.dioxide by quality:"
## white_wine$quality: 3
## [1] 159.5
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 117
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 151
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 132
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 122
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 122
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 119

The total sulfur dioxide has a similar relationship to quality as the free sulfur dioxide. The middle quality levels of 5-8 have a higher concentration than a level of 4 or 9. There is, however, a steady decrease in total sulfur dioxide concentrations from quality level 5 to higher levels.

Density vs. Quality

## [1] "Median of density by quality:"
## white_wine$quality: 3
## [1] 0.994425
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 0.9941
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 0.9953
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 0.99366
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 0.99176
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 0.99164
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 0.9903

We see that lower density means overall higher quality. There is a slight uptick in density levels between a wine quality level of 4 to 5, then it decreases as the quality level increases.

pH vs. Quality

## [1] "Median of pH by quality:"
## white_wine$quality: 3
## [1] 3.215
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 3.16
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 3.16
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 3.18
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 3.2
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 3.23
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 3.28

We see that as the pH level increases the quality increases as well. We will check the correlations between pH levels and acidity to see if there is a strong correlation.

Sulphates vs. Quality

## [1] "Median of sulphates by quality:"
## white_wine$quality: 3
## [1] 0.44
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 0.47
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 0.47
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 0.48
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 0.48
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 0.46
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 0.46

We see that the sulphates concentration increases slightly as the quality increases; however, it does have a small drop to a concentration of 0.46 g/dm^3 at a quality level of 8 and 9.

Alcohol vs. Quality

## [1] "Median of alcohol by quality:"
## white_wine$quality: 3
## [1] 10.45
## ------------------------------------------------------------ 
## white_wine$quality: 4
## [1] 10.1
## ------------------------------------------------------------ 
## white_wine$quality: 5
## [1] 9.5
## ------------------------------------------------------------ 
## white_wine$quality: 6
## [1] 10.5
## ------------------------------------------------------------ 
## white_wine$quality: 7
## [1] 11.4
## ------------------------------------------------------------ 
## white_wine$quality: 8
## [1] 12
## ------------------------------------------------------------ 
## white_wine$quality: 9
## [1] 12.5

Other than a small dip at a quality rating of 5, the quality of the wine increases as the alcohol content increases.

Acidity and pH

We can see as the pH increases the fixed acidity drops as the wines approach a neutral pH level of 7.

Correlation coefficient:

## 
##  Pearson's product-moment correlation
## 
## data:  pH and log10(fixed.acidity)
## t = -33.783, df = 4896, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4572280 -0.4117972
## sample estimates:
##        cor 
## -0.4347892

We see there is a weak negative correlation between fixed acidity and pH levels.
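
The reported test statistic can be reproduced from the correlation alone, since for Pearson's test t = r * sqrt(df) / sqrt(1 - r^2). The sketch below plugs in the values printed above. It is a check of the arithmetic under that standard formula, not a rerun of the original `cor.test()` call.

```r
r  <- -0.4347892   # sample correlation reported above
df <- 4896         # degrees of freedom, n - 2 with n = 4898

# Test statistic for the null hypothesis that the true correlation is 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
t_stat             # close to the reported t = -33.783

# Share of the variance in pH linearly associated with log10(fixed.acidity).
r^2
```

The squared correlation is roughly 0.19, which helps explain why the relationship is described as weak even though the p-value is tiny: with 4,898 observations, even modest correlations are statistically significant.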

We can see the citric acid does not have a relation with the pH levels.

The volatile acidity does not have a relation with the pH level either.

Density, Sugar, and Alcohol Content

We expect the density of wine to be close to that of water (1 g/cm^3); however, it depends on the sugar and alcohol content of the wine.

We see there is an increase in density as the residual sugar increases.

Meanwhile, there is a decrease in density as the alcohol content increases.

## 
##  Pearson's product-moment correlation
## 
## data:  residual.sugar and alcohol
## t = -35.321, df = 4896, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4726723 -0.4280267
## sample estimates:
##        cor 
## -0.4506312

I was surprised by this correlation; I expected to see a stronger correlation between the alcohol content and the residual sugars. The reason for this expectation is that the alcohol in wine is formed by the fermentation of the sugars in the grapes.

We are not aware of what grapes were used as each type of grape may yield different sugar contents.

Sulphates and Sulfur Dioxide

Sulphates are wine additives that contribute to sulfur dioxide gas levels.

## 
##  Pearson's product-moment correlation
## 
## data:  total.sulfur.dioxide and sulphates
## t = 9.5019, df = 4896, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1069590 0.1619585
## sample estimates:
##       cor 
## 0.1345624
## 
##  Pearson's product-moment correlation
## 
## data:  free.sulfur.dioxide and sulphates
## t = 4.1508, df = 4896, p-value = 3.369e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.03126264 0.08707928
## sample estimates:
##        cor 
## 0.05921725

Based on the correlation coefficient we see there is almost no relation between sulphate levels and sulfur dioxide.

Bivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset?

We see that wine quality is impacted by the amount of residual sugar and alcohol in the wine. I have documented the correlation coefficients of the other variables with respect to quality below.

##                             [,1]
## fixed.acidity        -0.08448545
## volatile.acidity     -0.19656168
## citric.acid           0.01833273
## residual.sugar       -0.08206979
## chlorides            -0.31448848
## free.sulfur.dioxide   0.02371338
## total.sulfur.dioxide -0.19668029
## density              -0.34835102
## pH                    0.10936208
## sulphates             0.03331897
## alcohol               0.44036918

Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?

I was surprised that there was not a stronger relationship between the residual sugars and alcohol levels, mainly because alcohol comes from the fermentation of sugars.

What was the strongest relationship you found?

The variable that had the strongest relationship with quality was the alcohol content level.
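
To compare the coefficients fairly, it helps to rank them by absolute value, since a strong negative correlation matters as much as a strong positive one. The sketch below copies the values from the table above into a named vector; the vector name `cor_with_quality` is illustrative.

```r
# Correlations with quality, copied from the table above.
cor_with_quality <- c(
  fixed.acidity        = -0.08448545,
  volatile.acidity     = -0.19656168,
  citric.acid          =  0.01833273,
  residual.sugar       = -0.08206979,
  chlorides            = -0.31448848,
  free.sulfur.dioxide  =  0.02371338,
  total.sulfur.dioxide = -0.19668029,
  density              = -0.34835102,
  pH                   =  0.10936208,
  sulphates            =  0.03331897,
  alcohol              =  0.44036918
)

# Sort by magnitude so negative relationships are not overlooked.
ranked <- cor_with_quality[order(abs(cor_with_quality), decreasing = TRUE)]
round(ranked, 3)
names(ranked)[1:3]   # the three variables most strongly related to quality
```

Ranked this way, alcohol comes first, followed by density and chlorides.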

Multivariate Plots Section

Correlation Matrix

To assist with this section we will go ahead and make a correlation matrix:
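
As a minimal sketch of how such a matrix can be computed, the block below calls base R's `cor()` on the first ten rows of four columns, copied from the `str()` output at the top of the report. The name `wine_head` is hypothetical, and ten rows are far too few to be representative; the real matrix would use all 4,898 wines.

```r
# First ten rows of four columns, copied from the str() output in the report.
# `wine_head` is an illustrative name, not an object from the original analysis.
wine_head <- data.frame(
  fixed.acidity  = c(7, 6.3, 8.1, 7.2, 7.2, 8.1, 6.2, 7, 6.3, 8.1),
  residual.sugar = c(20.7, 1.6, 6.9, 8.5, 8.5, 6.9, 7, 20.7, 1.6, 1.5),
  pH             = c(3, 3.3, 3.26, 3.19, 3.19, 3.26, 3.18, 3, 3.3, 3.22),
  alcohol        = c(8.8, 9.5, 10.1, 9.9, 9.9, 10.1, 9.6, 8.8, 9.5, 11)
)

# Pairwise Pearson correlations (the default method) between all columns.
cor_matrix <- cor(wine_head)
round(cor_matrix, 2)
```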

Alcohol, Citric Acid, Residual Sugar, and Quality

Out of the various variables, alcohol and density correlate most strongly with quality.

These plots show the steps I used to include the relationship of a third variable in the plots of quality compared to alcohol. Alcohol is on the x-axis as it has the strongest correlation with quality. The plots show the positive correlation between alcohol and quality while also showing the weak correlations with pH and residual sugar.

Multivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

The plots show that a higher alcohol level combined with a lower level of residual sugar goes with a higher overall quality level.

Were there any interesting or surprising interactions between features?

I was surprised that a higher amount of citric acid together with a higher alcohol content does not necessarily mean a higher quality score.


Final Plots and Summary

Plot One

Description One

We see a big impact of the alcohol level on the quality of wines. For the quality range from 3 to 5 there is a dip in alcohol level, but after this slight dip the quality rating rises as the alcohol level increases.

Plot Two

Description Two

We can see the distribution of the fixed acidity concentration across the pH levels. As the fixed acidity levels decrease the pH levels increase. This makes sense as the pH scale runs from 0 to 14: 0 is very acidic, like battery acid; 7 is neutral, similar to water; and 14 is alkaline, close to a drain cleaner fluid.

Plot Three

Description Three

We see that there is a relationship between the residual sugar concentration and density. The more residual sugar, the higher the density, which is also evident from the outliers on the graph.


Reflection

The project was an enjoyable opportunity to apply the knowledge from learning R to a real-world application. Moving from comparing one variable at a time to multivariate plots showed that some variables impact each other, which in turn impacts the quality. One struggle I went through was correcting some of the errors that popped up when running the code. Something that went well was pulling the data and analyzing the data in various columns, such as the quantiles, mean, and median for the alcohol level.

Some future work that could be done would be to incorporate the red wine dataset in this exploration to see if the quality of red wines is impacted the same as white wines or if different variables change the impact.